# A 10-GHz Global Clock Distribution Using Coupled Standing-Wave Oscillators

Frank O'Mahony, *Student Member, IEEE*, C. Patrick Yue, *Member, IEEE*, Mark A. Horowitz, *Fellow, IEEE*, and S. Simon Wong, *Fellow, IEEE* 

Abstract—In this paper, a global clock network that incorporates standing waves and coupled oscillators to distribute a high-frequency clock signal with low skew and low jitter is described. The key design issues involved in generating standing waves on a chip are discussed, including minimizing wire loss within an available technology. A standing-wave oscillator, which is a distributed oscillator that sustains ideal standing waves on lossy wires, is introduced. A clock grid architecture comprised of coupled standing-wave oscillators and differential low-swing clock buffers is presented, along with a compact circuit model for networks of oscillators. The measured results for a prototyped standing-wave clock grid operating at 10 GHz and fabricated in a 0.18- $\mu$ m 6M CMOS logic process are presented. A technique is proposed for on-chip skew measurements with subpicosecond precision.

Index Terms—Clock distribution, coupled oscillators, distributed oscillators, on-chip phase measurement, resonant clocking, salphasic, standing wave.

## I. INTRODUCTION

THE DESIGN of global clock distributions for multigigahertz microprocessors has become an increasingly difficult and time-consuming task. As the frequency of the global clock continues to increase, the timing uncertainty introduced by the clock network (the skew and jitter) must shrink in proportion to the clock period. However, the clock skew and jitter for conventional buffered H-trees are proportional to latency, which has increased for recent generations of microprocessors [1].

A global clock network that uses standing waves and coupled oscillators has the potential to significantly reduce both skew and jitter. Standing waves have the unique property that phase does not depend on position, meaning that there is ideally no skew. They have previously been used for board-level clock distribution [2], on coaxial cables [3], and on superconducting wires [4], but have never been implemented on chip due to the large losses of on-chip interconnects. Networks of coupled oscillators have a phase-averaging effect that reduces both skew and jitter. However, none of the previous implementations of

Manuscript received March 29, 2003; revised June 14, 2003. This work was supported by Intel Corporation and by the MARCO Interconnect Focus Center.

F. O'Mahony was with the Department of Electrical Engineering, Stanford University, Stanford, CA 94305 USA. He is now with the Circuit Research Laboratory, Intel Corporation, Hillsboro, OR 97124 USA (e-mail: frank.o'mahony@intel.com).

C. P. Yue is with Aeluros Inc., Mountain View, CA 94040 USA.

M. A. Horowitz is with the Department of Computer Science, Stanford University, Stanford, CA 94305 USA.

S. S. Wong is with the Department of Electrical Engineering, Integrated Circuits Laboratory, Stanford University, Stanford, CA 94305 USA.

Digital Object Identifier 10.1109/JSSC.2003.818299



Fig. 1. 10-GHz standing-wave clock distribution network.

coupled-oscillator clock networks use standing waves and some require considerable circuitry to couple the oscillators [5]–[9].

This paper describes the operation and design of a global standing-wave clock distribution network comprised of coupled oscillators and intended for multigigahertz clock frequencies (Fig. 1). Section II explains why generating ideal standing waves on lossy interconnects is difficult and shows how distributed amplification can compensate for these losses. A new type of distributed oscillator that sustains ideal standing waves on lossy interconnects is introduced and a design example is given. Section III describes how these oscillators can be coupled together in a simple way to create a grid of standing waves and proposes a clock buffer that converts the low-swing clock signal to digital levels. A compact circuit model for a network of strongly coupled oscillators is introduced and verified with measured and simulated data in Section IV. Finally, Section V presents the design and results for a 10-GHz standing-wave clock grid fabricated in a 0.18- $\mu$ m six-metal CMOS logic process. The technique used to measure the skew of the prototyped clock grid with subpicosecond resolution is also described.

## II. STANDING-WAVE OSCILLATOR

### A. Standing Waves

A standing wave is formed when two identical waves that are propagating in opposite directions interact. The general case of two waves traveling in opposite directions with arbitrary phase and with amplitudes  $A_1 \geq A_2$  is described by

$$A_1 \cos(\omega t - \beta z) + A_2 \cos(\omega t + \beta z + \phi)$$

$$= 2A_2 \cos\left(\omega t + \frac{\phi}{2}\right) \cos\left(\beta z + \frac{\phi}{2}\right) + \underbrace{(A_1 - A_2)\cos(\omega t - \beta z)}_{\text{Traveling wave}}. (1)$$



Fig. 2. Generating standing waves using a short-circuit termination.

The traveling-wave term in (1) reduces to zero when the amplitudes of the two waves are identical. Unlike a traveling wave, which has phase that varies linearly with position, a standing wave has the same phase regardless of position but amplitude that varies sinusoidally.
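The decomposition in (1) can be checked numerically. The following is a minimal sketch, not part of the original work; the function names and the test amplitudes are illustrative assumptions.

```python
import math

def two_wave_sum(a1, a2, phi, wt, bz):
    """Left-hand side of (1): forward wave plus backward wave."""
    return a1 * math.cos(wt - bz) + a2 * math.cos(wt + bz + phi)

def standing_plus_traveling(a1, a2, phi, wt, bz):
    """Right-hand side of (1): (standing-wave term, residual traveling wave)."""
    standing = 2 * a2 * math.cos(wt + phi / 2) * math.cos(bz + phi / 2)
    traveling = (a1 - a2) * math.cos(wt - bz)
    return standing, traveling

def max_identity_error(a1, a2, phi, steps=50):
    """Largest mismatch between both sides of (1) on a grid of (wt, bz) values."""
    worst = 0.0
    for i in range(steps):
        for j in range(steps):
            wt = 2 * math.pi * i / steps
            bz = 2 * math.pi * j / steps
            standing, traveling = standing_plus_traveling(a1, a2, phi, wt, bz)
            lhs = two_wave_sum(a1, a2, phi, wt, bz)
            worst = max(worst, abs(lhs - (standing + traveling)))
    return worst
```

With equal amplitudes the second returned term is zero for every time and position, which is the pure standing-wave case described above.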

A simple way to generate a voltage standing wave is to send an incident wave down a transmission line and reflect it back with a lossless termination such as a short circuit (Fig. 2). However, wire losses cause amplitude mismatch between the incident and reflected waves, resulting in a residual traveling wave. The amplitude of the traveling wave—and, hence, the skew—is directly related to the loss and length of the wire. Previous standing-wave implementations achieve low skew with low-loss transmission lines or distances that are small relative to a wavelength. However, a standing-wave clock network for a multigigahertz microprocessor would need to span multiple wavelengths using lossy on-chip interconnects.
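To give a feel for the size of this effect, the sketch below estimates the phase variation along an unamplified, short-circuited line. It assumes the wire parameters listed later in Table I, an ideal short (reflection coefficient of -1), and a span of one quarter wavelength measured from the short; these choices and all function names are illustrative assumptions rather than results from the paper.

```python
import cmath
import math

F_CLK = 10e9                          # Hz
R, L, C = 4.65, 184e-12, 250e-15      # ohm/mm, H/mm, F/mm (Table I values)

def gamma_per_mm(r):
    """Propagation constant of the unamplified line for series resistance r."""
    w = 2 * math.pi * F_CLK
    return cmath.sqrt((r + 1j * w * L) * (1j * w * C))

def phase_spread_ps(r, points=200):
    """Spread of the clock phase (in ps) over a quarter wavelength next to the short."""
    g = gamma_per_mm(r)
    quarter_wave = (math.pi / 2) / g.imag       # mm
    phases = []
    for i in range(1, points + 1):
        u = quarter_wave * i / points           # distance from the short, mm
        # Incident plus reflected wave is proportional to sinh(gamma * u).
        phases.append(cmath.phase(cmath.sinh(g * u)))
    spread_rad = max(phases) - min(phases)
    return spread_rad / (2 * math.pi * F_CLK) * 1e12
```

Under these assumptions `phase_spread_ps(0.0)` is zero (ideal standing wave), while `phase_spread_ps(R)` comes out at roughly 3 ps, which illustrates why uncompensated on-chip wires are problematic at a 100-ps clock period.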

### B. Compensating for Wire Losses

Distributed transconductors can compensate for signal attenuation due to wire loss [5], [10], [11]. The transconductors can be analyzed as a distributed transconductance if they are spaced sufficiently close ( $\leq \lambda/10$ ). The equivalent lumped model for a transmission line with distributed transconductance is shown in Fig. 3, where R, L, and C are the distributed transmission-line parameters and  $G_d$  is the effective transconductance per unit length. The loss is given by

$$\alpha = \text{Re}\sqrt{(R+j\omega L)(-G_d+j\omega C)} \approx \frac{R}{2Z_o} - \frac{G_dZ_o}{2} \tag{2}$$

where the approximate expression is valid when  $R \cdot G_d$  is small relative to the other terms in the quadratic. For a correct choice of transconductance in (2), the wire becomes effectively lossless.
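As an illustration of (2), the sketch below evaluates the exact and approximate loss for the wire parameters listed later in Table I. The helper names are assumptions, and the sign convention for the square root (choosing the root with positive phase constant) is an assumption made here so that net gain appears as a negative $\alpha$.

```python
import cmath
import math

R = 4.65                      # ohm/mm (Table I)
L = 184e-12                   # H/mm   (Table I)
C = 250e-15                   # F/mm   (Table I)
OMEGA = 2 * math.pi * 10e9    # rad/s at 10 GHz

def gamma(g_d):
    """Propagation constant per mm for a distributed transconductance g_d (S/mm)."""
    g = cmath.sqrt((R + 1j * OMEGA * L) * (-g_d + 1j * OMEGA * C))
    # cmath.sqrt returns the root with non-negative real part; select the
    # forward-traveling root (beta > 0) so that net gain gives alpha < 0.
    return g if g.imag >= 0 else -g

def alpha_exact(g_d):
    """Exact loss in Np/mm from the left-hand expression of (2)."""
    return gamma(g_d).real

def alpha_approx(g_d):
    """Approximate loss in Np/mm from the right-hand expression of (2)."""
    z_o = math.sqrt(L / C)
    return R / (2 * z_o) - g_d * z_o / 2

# Transconductance per mm that zeroes the approximate loss: R / Z_o**2 = R*C/L.
G_LOSSLESS = R * C / L
```

For these inputs `alpha_exact(0.0)` is about 0.084 Np/mm, matching the Table I entry, while the approximation gives about 0.086 Np/mm. `G_LOSSLESS` is about 6.3 mS/mm, and at that value the exact expression is also lossless because the product under the square root becomes purely real and negative.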

### C. Circuit Implementation

Although it is not practical to maintain a precise transconductance over an entire chip, it is possible to design transconductors that will reliably exceed the wire loss. For this reason, it is preferable to use distributed transconductors in an oscillator. The inherent amplitude saturation of the transconductors causes them to self-limit, so they exactly compensate for wire



Fig. 3. Transmission-line model with transconductance.



Fig. 4. Standing-wave oscillator with three cross-coupled pairs.

loss. A previously demonstrated rotary clock distribution used distributed cross-coupled inverters to compensate for wire loss and create on-chip oscillators, but it generated traveling waves (not standing waves) [5]. The circuit in Fig. 4 is a distributed standing-wave oscillator (SWO) that sustains ideal standing waves on lossy wires. Differential transmission lines form a half-wave  $(\lambda/2)$  resonator with the lines shorted together at both ends to provide virtual grounds for differential-mode signals. The nMOS cross-coupled pairs provide enough gain to compensate for wire losses, and the pMOS diode-connected loads set the common-mode voltage. The parasitic capacitance of the transconductors,  $c_d$ , will load the SWO, thereby increasing the wire loss and decreasing the oscillation frequency. Therefore, the quality of the transconductor can be quantified by  $g_d/c_d$  and should be maximized. Also, wire loss should be minimized in order to minimize power. The SWO will oscillate at the desired frequency,  $\omega_{\rm osc}$ , if it satisfies

$$\alpha(\omega_{\rm osc}) < 0 \quad \text{and} \quad l = \frac{\pi}{\beta(\omega_{\rm osc})}$$

where

$$\alpha = \text{Re}(\gamma) \quad \beta = \text{Im}(\gamma)$$

$$\gamma = \sqrt{(R + j\omega L) \left(\frac{-ng_d}{l} + j\omega \left(C + \frac{nc_d}{l}\right)\right)} \quad . \tag{3}$$

In the above equation,  $\alpha$ ,  $\beta$ , and  $\gamma$  are the transmission-line attenuation, phase, and propagation constants, respectively, l is the length of the resonator, and n is the number of cross-coupled pairs that are distributed along the resonator.
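Because $l$ appears on both sides of the resonance condition (through $ng_d/l$ and $nc_d/l$), the length has to be found iteratively. The sketch below solves (3) by bisection using the 0.8-mA unit-pair values from Table I. The `size` multiplier (cross-coupled pair width in units of the unit ccp), the bisection bracket, and the function names are illustrative assumptions and do not reproduce the authors' design flow.

```python
import cmath
import math

R, L, C = 4.65, 184e-12, 250e-15      # ohm/mm, H/mm, F/mm (Table I)
G_UNIT, C_UNIT = 3.60e-3, 76.3e-15    # S and F per unit ccp at 0.8 mA (Table I)
N_PAIRS = 5                           # cross-coupled pairs per resonator
OMEGA = 2 * math.pi * 10e9            # rad/s at the 10-GHz target

def gamma(length_mm, size):
    """Propagation constant of (3); `size` scales each ccp in unit-ccp multiples."""
    g_line = N_PAIRS * size * G_UNIT / length_mm
    c_line = C + N_PAIRS * size * C_UNIT / length_mm
    g = cmath.sqrt((R + 1j * OMEGA * L) * (-g_line + 1j * OMEGA * c_line))
    return g if g.imag >= 0 else -g   # keep beta > 0 so that gain gives alpha < 0

def resonator_length(size, lo=1e-3, hi=50.0):
    """Bisection for the length l (mm) satisfying l = pi / beta at OMEGA."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if gamma(mid, size).imag * mid < math.pi:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def design_point(size):
    """Return (length in mm, alpha in Np/mm) for a given ccp size."""
    length = resonator_length(size)
    return length, gamma(length, size).real
```

Sweeping `size` in this sketch reproduces the qualitative behavior described in the text: larger cross-coupled pairs shorten the resonator (more capacitive loading) and drive $\alpha$ from positive toward negative values.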

### D. Design Example

An SWO is straightforward to design from (3). Table I lists the parameters that will be used to design an SWO with

TABLE I. DESIGN PARAMETERS FOR SWO EXAMPLE

| Parameter                              | Value                          |
|----------------------------------------|--------------------------------|
| $R$ ($\Omega$/mm)                      | 4.65                           |
| $L$ (pH/mm)                            | 184                            |
| $C$ (fF/mm)                            | 250                            |
| $\alpha$ (per mm at $f_{\rm osc}$)     | 0.084 Np / 0.73 dB             |
| $g_d$ (mS per unit ccp)                | 3.60 at 0.8 mA; 4.05 at 1.0 mA |
| $c_d$ (fF per unit ccp)                | 76.3 at 0.8 mA; 78.6 at 1.0 mA |
| $n$ (ccps)                             | 5                              |
| $f_{\rm osc}$ (GHz)                    | 10.0                           |



Fig. 5. Loss and length to oscillate at 10 GHz as a function of cross-coupled pair sizing (design point shown).

five equally sized and equally spaced cross-coupled pairs. The transmission-line cross section (Fig. 1) is optimized for minimum loss given a total track width of 32  $\mu$ m and a distance of 3.5  $\mu$ m to the ground plane. A unit cross-coupled pair (ccp) is defined to have nMOS and pMOS devices that are 18- $\mu$ m wide and 0.18- $\mu$ m long and is optimized for maximum  $g_d/c_d$  with 1.0 mA of bias current. First, the resonator length is obtained as a function of cross-coupled pair sizing by solving (3) using the desired oscillation frequency  $f_{\rm osc}$ . Then, the length, transmission-line parameters,  $g_d$ , and  $c_d$  are used in (3) to calculate the effective wire loss (Fig. 5). The design point is conservatively chosen so that the loss becomes zero when the bias current is 0.8 mA. The simulated voltage waveforms spanning from the end to the center of the SWO at increments of l/10 are shown in Fig. 6. The free-running frequency of the oscillator is 9.5 GHz, within 5% of the design goal. Note that the amplitude varies sinusoidally with position and the phase coherence is better than 1 ps. The amplitude and frequency sensitivity to the supply voltage is plotted in Fig. 7. The SWO oscillates for  $V_{\rm supply} \geq 1.5~{\rm V}$  and the amplitude saturates quickly when



Fig. 6. Simulated voltage waveforms for SWO at l/10 increments.



Fig. 7. Simulated SWO amplitude and frequency sensitivity to supply voltage.

 $V_{\mathrm{supply}}$  reaches 1.6–1.7 V. The oscillating frequency is not a strong function of supply voltage (0.56 MHz/mV near 1.8 V) since the parasitic drain capacitance of the cross-coupled pair,  $c_d$ , does not change much with voltage, and the interconnect parameters L, C, and R are constant. Furthermore, the phase constant  $\beta$  is not affected much by small changes in  $g_d$  due to variations in the supply voltage. This low sensitivity to the power supply is a key advantage for SWOs. The delay—and, hence, the skew and jitter—of a buffered H-tree is a strong function of power supply variations due to the large sensitivity of inverter delay.

## III. STANDING-WAVE CLOCK GRID

### A. Coupling and Injection-Locking SWOs

On-chip transmission-line resonators have an inherently modest quality factor (Q) that allows coupling and injection locking of the SWOs over a range of frequencies. For the SWO in the previous design example, the loaded Q is 2.7. SWOs can be coupled together by simply connecting their transmission lines. The coupling strength is largest when the oscillators are connected at the center and zero near the ends.



Fig. 8. Simulated voltage standing waves for prototyped clock grid.

Any detuning between coupled oscillators results in skew that is directly related to the coupling strength and Q [12]. Therefore, low-Q resonators that are strongly coupled should be used for clock distribution. These oscillators can also be injection locked to a reference signal. Injection locking allows the clock frequency to be dictated by an external clock source such as a phase-locked loop (PLL) and stabilizes the otherwise noisy signal of this low-Q oscillator. In this work, a reference signal is ac coupled into the gate of the pMOS loads at the center of an SWO. The locking range and the skew caused by driving an injection-locked oscillator off-resonance are also related to the coupling strength [12]. Again, strong coupling is preferable for low skew and a wide locking range.

### B. Grid Architecture

A resonant grid of coupled SWOs is shown in Fig. 1. Choosing the coupling point is a tradeoff between the size of the grid and the coupling strength. Connecting the SWOs 15%–20% from the short circuits provides strong enough coupling to lock the segments together without causing excessive skew due to mismatches between SWOs. To make a grid pattern, the ends of the SWOs are folded at right angles to the grid. Due to the sinusoidal amplitude envelope of standing waves, the folded segments have low voltage amplitudes and, hence, are inappropriate for recovering the clock. For practical microprocessor clock distributions, field-effect transistor (FET) switches could be used to short the ends of the SWOs during normal operation and turned off to facilitate low-frequency testing. The voltage standing-wave pattern for a portion of the grid is shown in Fig. 8.

### C. Phase Averaging in Grids

Within a grid of coupled oscillators, phase is averaged at each coupling point. Phase differences among the SWOs, either skew due to mismatch or jitter due to power supply variations, are reduced by this averaging process. In our approach, each SWO is coupled in up to three locations. The averaging effect is directly related to the coupling strength. In order to test how well the coupled oscillators suppress jitter caused by localized supply noise, we simulated a single SWO, a grid of four SWOs, and the full grid in Fig. 1, all injection locked at 10 GHz. In each



Fig. 9. Clock buffer.



Fig. 10. Simulated clock buffer performance.

case, the power supply for one cross-coupled pair was reduced by 10% with a 100-ps fall time. The resulting period jitter on the oscillator network was 0.41, 0.26, and 0.17 ps, respectively, confirming that the phase averaging property of a coupled network of SWOs reduces jitter.

### D. Clock Buffer

Standing-wave clock distribution is intended to interface with a conventional digital clock distribution at lower levels of the clock hierarchy. Therefore, a buffer is required to convert the low-swing differential sinusoids to digital levels without adding significant amounts of timing error due to variations of the input amplitude. A two-stage clock buffer based on [13] is shown in Fig. 9. The first stage is a differential pair with a small gate overdrive, allowing complete current switching even for the smallest expected input amplitude. It amplifies and limits the signal so the output amplitude is roughly independent of the input amplitude. A low-pass filter attenuates the harmonics added by the limiting amplifier that would otherwise cause amplitude-dependent skew. The second stage is a sine-to-square converter that uses cross-coupled inverters and a shunt resistor to achieve a well-controlled 50% duty cycle across process, temperature, frequency, and supply variations. Because the 0.18- $\mu$ m devices chosen for demonstration are not adequate to test the clock buffer at 10 GHz, the buffer was simulated with a



Fig. 11. SWO model.

2-GHz sinusoidal input clock. This clock period corresponds to an aggressive seven fanout-4 (FO4) delays in this technology. The clock buffer exhibits 5.9-ps skew (1.2% of the clock cycle) for the 30% voltage variation seen across the center 50% of a standing wave (Fig. 10). Assuming similar performance using devices in a future process capable of 10-GHz operation, the amplitude-dependent skew will be about 1 ps.
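The scaling argument above is simple arithmetic and can be written out as follows. This is a minimal sketch; the helper names are assumptions, and the extrapolation to 10 GHz relies on the paper's own assumption that the skew stays a constant fraction of the clock period.

```python
def skew_fraction(skew_ps, clock_ghz):
    """Skew expressed as a fraction of the clock period."""
    period_ps = 1e3 / clock_ghz
    return skew_ps / period_ps

def scaled_skew_ps(fraction, clock_ghz):
    """Skew in ps if the same fraction of the period is kept at another frequency."""
    return fraction * 1e3 / clock_ghz

frac_2ghz = skew_fraction(5.9, 2.0)           # simulated buffer skew at 2 GHz
skew_10ghz = scaled_skew_ps(frac_2ghz, 10.0)  # projected skew at 10 GHz
fo4_ps = (1e3 / 2.0) / 7                      # one FO4 delay if 500 ps equals seven FO4
```

Here `frac_2ghz` is 0.0118 (the quoted 1.2%), `skew_10ghz` is 1.18 ps (the quoted "about 1 ps"), and the implied FO4 delay is roughly 71 ps.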

## IV. MODELING

Networks of discrete coupled oscillators can be used to accurately model SWO grids. The circuit in Fig. 11 models two half-SWOs and is defined so that the coupling point between them is at the center of the circuit (not at the ends). The behavior of the transmission-line resonators is well approximated near resonance by RLC tanks that are inductively coupled. The cross-coupled pairs are lumped and modeled as nonlinear transconductors,  $-g_d(A)$ , using a third-order polynomial to approximate the saturation behavior [14] and scaling their I-V curves based on their position along the SWO.  $I_{\rm inj1}$  and  $I_{\rm inj2}$  represent currents that are injected from adjacent SWO segments or an external current source (such as the locking signal). Following a derivation similar to that in [12], the coupled differential equations describing the model are

$$\begin{aligned}
\frac{dV_1}{dt} &= \frac{1}{1 + \frac{1}{2}(k+1)} \cdot \left[ V_1 \left[ j\omega_o(k+1) + \frac{\omega_o}{Q} \mu \left( \alpha_o^2 - |V_1|^2 \right) \right] + \frac{\omega_o}{Q} I_{\text{inj1}} R_L + \frac{dV_2}{dt} \frac{1}{2}(k-1) - V_2 j\omega_o(k-1) \right] \\
\frac{dV_2}{dt} &= \frac{1}{1 + \frac{1}{2}(k+1)} \cdot \left[ V_2 \left[ j\omega_o(k+1) + \frac{\omega_o}{Q} \mu \left( \alpha_o^2 - |V_2|^2 \right) \right] + \frac{\omega_o}{Q} I_{\text{inj2}} R_L + \frac{dV_1}{dt} \frac{1}{2}(k-1) - V_1 j\omega_o(k-1) \right]
\end{aligned} \tag{4}$$

where  $\omega_o$  is the resonant frequency, Q is the loaded quality factor of the resonator, and  $\mu$  is the parameter that describes the nonlinearity of the transconductance. The coupling parameter k is defined to be  $(L_A + L_B)/L_A$ . Note that k is infinite when the two resonators are completely coupled and unity when they are completely uncoupled, which is consistent with (4). For a stub length,  $l_c$ , that is l/5 or less—where l is the length of the SWO—k can be approximated by

$$k \approx \left(1 - \frac{\pi}{4} \tan\left(\frac{\pi l_c}{l}\right)\right)^{-1}. \tag{5}$$
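A direct evaluation of (5) is sketched below; the function name and the range check are illustrative assumptions.

```python
import math

def coupling_parameter(stub_fraction):
    """Approximate k from (5); stub_fraction is l_c / l and must not exceed 0.2."""
    if not 0.0 <= stub_fraction <= 0.2:
        raise ValueError("(5) is stated only for l_c <= l/5")
    return 1.0 / (1.0 - (math.pi / 4) * math.tan(math.pi * stub_fraction))
```

For example, `coupling_parameter(0.0)` returns 1 (uncoupled), a stub of 15% of the SWO length gives about 1.67, and a stub of $l/5$ gives about 2.33.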



Fig. 12. Simulated (—) and modeled (--) skew for four coupled SWOs with detuning.



Fig. 13. Schematic of test chip grid with eight SWOs.

Using (4), the transient response for a grid of SWOs can be calculated by solving a matrix of coupled, differential equations. Fig. 12 illustrates the agreement between the simulated and modeled steady-state phase for a grid of four SWOs (using the SWO designed in Section II) that are coupled l/5 from the short circuits and detuned by varying the lengths of two of the SWOs by  $\pm 10\%$ . The skew is referenced to the phase of the two center points. Positive skew indicates lagging phase and negative skew indicates leading phase.
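Since each equation in (4) contains the derivative of the other voltage, the pair must first be solved as a 2x2 linear system before a standard integrator can be applied. The sketch below does this for two oscillators and steps the result with a fourth-order Runge-Kutta scheme in normalized time $\tau = \omega_o t$. The values of $\mu$, $\alpha_o$, and $R_L$, the step size, and all function names are illustrative assumptions; only $Q = 2.7$ is taken from the text. It is a test harness for the form of (4), not a reproduction of the authors' grid solver.

```python
def derivatives(v1, v2, k, q, mu, a0, inj1=0.0, inj2=0.0, r_load=1.0):
    """Explicit dV1/dtau, dV2/dtau from (4), with tau = omega_o * t."""
    a = 1.0 + 0.5 * (k + 1.0)      # coefficient multiplying each own derivative
    b = 0.5 * (k - 1.0)            # coefficient multiplying the other derivative
    c = 1j * (k - 1.0)             # cross-coupling of the other voltage

    def own_terms(v, inj):
        return v * (1j * (k + 1.0) + (mu / q) * (a0**2 - abs(v)**2)) + inj * r_load / q

    r1 = own_terms(v1, inj1) - c * v2
    r2 = own_terms(v2, inj2) - c * v1
    det = a * a - b * b            # equals 2 * (1 + k), nonzero for k >= 1
    return (a * r1 + b * r2) / det, (b * r1 + a * r2) / det

def rk4_step(v1, v2, h, **params):
    """One classical Runge-Kutta step of size h for the pair (v1, v2)."""
    k1 = derivatives(v1, v2, **params)
    k2 = derivatives(v1 + 0.5 * h * k1[0], v2 + 0.5 * h * k1[1], **params)
    k3 = derivatives(v1 + 0.5 * h * k2[0], v2 + 0.5 * h * k2[1], **params)
    k4 = derivatives(v1 + h * k3[0], v2 + h * k3[1], **params)
    v1 = v1 + h / 6.0 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
    v2 = v2 + h / 6.0 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
    return v1, v2

def settle(v1, v2, k, q=2.7, mu=1.0, a0=1.0, h=0.02, steps=10000):
    """Integrate from the given start voltages and return the final phasors."""
    for _ in range(steps):
        v1, v2 = rk4_step(v1, v2, h, k=k, q=q, mu=mu, a0=a0)
    return v1, v2
```

Two sanity checks follow from the structure of (4): with $k = 1$ the equations decouple and each amplitude settles to $\alpha_o$, and for any $k$ two oscillators started with identical voltages remain identical and also settle to $\alpha_o$. A grid solver would extend the same idea to a larger matrix with one row per SWO segment.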

## V. EXPERIMENT

A 10-GHz clock network comprised of eight coupled SWOs was prototyped in a 0.18- $\mu$ m 1.8-V CMOS process with six AlCu metal layers (Fig. 13). Clock buffers were not integrated due to the speed limitations of 0.18- $\mu$ m devices at 10 GHz but can be integrated readily once devices are scaled for 10-GHz operation. The differential  $\lambda/2$  lines are 3-mm long, 14- $\mu$ m wide, and are spaced 4  $\mu$ m apart in metal six. Although the design parameters are identical to the ones used in the example in Section II, the additional loading of testing and tuning circuitry and layout parasitics reduces the length required to oscillate



Fig. 14. Die micrograph.



Fig. 15. Test chip layout and timing measurement.

at 10 GHz and increases the necessary transconductance. Each SWO consists of five cross-coupled pairs with 90- $\mu$ m-wide devices. The transconductance of each cross-coupled pair is variable from 0 to 25 mS by changing the bias current; 18 mS is required to start oscillation. The grid is tunable from 9.8 to 10.5 GHz (6.4% range) with accumulation-mode MOS varactors that are positioned 400  $\mu$ m from the ends of the SWOs. Grid tuning extends the locking range and facilitates intentional skewing of specific grid segments for testing purposes. The total SWO capacitance is 48 pF, of which only 12 pF is from the interconnect itself. The considerable 36 pF of self-loading from the cross-coupled pairs will shrink significantly as the design is ported to future generations of devices. The SWOs consume 378 mW (5.25 mA/ccp), which is comparable to the 389 mW of  $CV^2f$  power required to drive the 12-pF interconnect capacitance digitally at 10 GHz. The measured sensitivity to power supply variations is 0.33 MHz/mV near 1.8 V, even less than the 0.56 MHz/mV simulated in Section II, and oscillations are possible for supply voltages down to 1.4 V. A die micrograph of the prototyped clock grid is shown in Fig. 14.
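The two power figures quoted above can be reproduced with the following arithmetic. This is a sketch under the assumption that the SWO power is simply the total bias current of all cross-coupled pairs multiplied by the 1.8-V supply; the variable names are illustrative.

```python
N_SWO = 8            # SWOs in the prototype grid
CCP_PER_SWO = 5      # cross-coupled pairs per SWO
I_CCP = 5.25e-3      # A per cross-coupled pair
V_SUPPLY = 1.8       # V
C_WIRE = 12e-12      # F, interconnect capacitance only
F_CLK = 10e9         # Hz

# Static power of the oscillators: total bias current times supply voltage.
p_swo_w = N_SWO * CCP_PER_SWO * I_CCP * V_SUPPLY

# Dynamic power to drive the same wire capacitance rail to rail at F_CLK.
p_cv2f_w = C_WIRE * V_SUPPLY**2 * F_CLK
```

These expressions evaluate to 0.378 W and about 0.389 W, in line with the quoted 378 mW and 389 mW.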

On-chip skew is measured with a homodyne technique that converts phase into dc voltage (Fig. 15). The clock signal is tapped at eight points around the grid and routed through length-matched transmission lines and multiplexers to a pair of mixers. The mixers compare the phase of each clock signal to a reference phase that is set by a calibrated off-chip phase shifter. The grid is folded to minimize the distance from tapping points to the



Fig. 16. Measured skew with different injection amplitudes.

mixers. The maximum sensitivity was measured to be 60 fs/mV by varying the reference phase with the phase shifter and observing the resulting change in the differential dc output voltage.
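The relationship between the quoted sensitivity and the mixer output can be illustrated with an idealized model. The sketch below assumes an ideal multiplying mixer whose differential dc output is $K \sin(\Delta\phi)$ and that is operated near its zero crossing, where the slope is steepest; this model, the constant $K$, and the function names are assumptions and are not taken from the paper.

```python
import math

F_CLK = 10e9                    # Hz
SENSITIVITY_FS_PER_MV = 60.0    # maximum sensitivity quoted in the text

def implied_peak_output_mv(sens_fs_per_mv=SENSITIVITY_FS_PER_MV, f_clk=F_CLK):
    """Peak output K (mV) of an ideal K*sin(dphi) mixer with the given best slope."""
    rad_per_mv = 2 * math.pi * f_clk * sens_fs_per_mv * 1e-15
    return 1.0 / rad_per_mv

def mixer_dc_mv(delta_t_s, peak_mv, f_clk=F_CLK):
    """Ideal mixer dc output (mV) for a timing offset delta_t_s from the null."""
    return peak_mv * math.sin(2 * math.pi * f_clk * delta_t_s)

def skew_fs(delta_mv, sens_fs_per_mv=SENSITIVITY_FS_PER_MV):
    """Small-signal conversion from a dc voltage change to a timing change."""
    return delta_mv * sens_fs_per_mv
```

Under this idealization, 60 fs/mV at 10 GHz corresponds to a peak differential output of roughly 265 mV, and a 10-mV change in the dc output maps to 600 fs of skew. The small-signal conversion is only valid close to the null, which is why the reference phase is adjusted with the phase shifter.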

Clock skew was measured while individually sweeping three variables: the amplitude of the injected signal, the frequency of the injected signal, and the tuning of the grid. First, the grid was tuned—using a single control voltage for all of the varactors—to 10.00 GHz and injection locked by an externally generated reference signal with amplitudes of 63, 125, and 250 mV. Fig. 16 shows the skew across the grid as a function of the injected signal amplitude. The perspective for this graph is looking sideways toward the long side of the grid. The clock phase around the grid is largely invariant to the amplitude of the injected signal except at the injection point. The phase at the injection point varies by 1.6 ps over the range of input amplitudes. Lower locking amplitude results in less global skew, but there is a tradeoff since the locking bandwidth is proportional to the amplitude of the injected signal. The locking bandwidth is 0.7%, 1.5%, and 3.7%, respectively, for the injected signal amplitudes.

Next, the skew was measured while using a 125-mV injected signal to sweep the grid across its locking range. Fig. 17 shows that driving the grid off-resonance causes a skew gradient. As discussed in Section III, stronger coupling can reduce this gradient, but will also shrink the size of the grid since more of the SWO length will be in the stubs. The skew calculated using the model from Section IV is also plotted in Fig. 17 and shows good agreement with the measured results (<1.5-ps difference). The model captures the shape of the skew gradient and the effects of injection locking.

Finally, the skew was measured when one-half of the grid was detuned by 1% to 10.10 GHz using the on-chip varactors. The resulting free-oscillating frequency is 10.05 GHz, which is the ensemble average of the oscillators. The grid was injection locked with a 125-mV signal from 10.00 to 10.10 GHz and the measured skew is shown in Fig. 18. These results illustrate how SWO mismatches cause skew gradients across the grid. Note that the portion of the grid tuned to 10.10 GHz leads the portion tuned to 10.00 GHz in phase.



Fig. 17. Measured (solid) and modeled (dashed) skew over locking range.



Fig. 18. Measured skew for detuned grid.

Fig. 19. Measured clock jitter for different locking amplitude and frequency.

The clock jitter was measured for the same three locking amplitudes while the grid was swept across the corresponding locking ranges. The results indicate that jitter is nearly constant except at the edges of the locking range, where it increases rapidly (Fig. 19). The resolution of this measurement was limited by the jitter of the signal generator that provided the reference clock. The measured jitter for the signal generator was 1.5-ps rms and is shown as the baseline in Fig. 19. Based on the jitter near the center of the locking range, the jitter added by the clock grid was about 0.8-ps rms.
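One common way to separate an instrument baseline from a measured rms jitter is root-sum-square subtraction, which assumes the two contributions are uncorrelated. The paper does not state how the 0.8-ps figure was obtained, so the sketch below is only an illustration of that assumption; the function names are hypothetical.

```python
import math

def rss(*components_ps):
    """Combine uncorrelated rms jitter contributions (ps)."""
    return math.sqrt(sum(c * c for c in components_ps))

def remove_baseline(total_ps, baseline_ps):
    """Estimate the added rms jitter after removing an uncorrelated baseline."""
    return math.sqrt(total_ps**2 - baseline_ps**2)
```

Under this assumption, a 1.5-ps baseline combined with 0.8 ps from the grid would appear as about 1.7 ps rms in the measurement, and `remove_baseline(1.7, 1.5)` recovers 0.8 ps.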

## VI. CONCLUSION

The first on-chip standing-wave clock distribution has been demonstrated. This approach benefits from the invariant phase property of standing waves and the phase averaging effect of coupled oscillators. A method for overcoming on-chip interconnect losses to generate ideal standing waves has been presented. The standing-wave oscillators can be coupled together to form a clock grid that injection locks to an external clock source. A model for networks of strongly coupled oscillators was introduced and verified with measured and simulated data. A 10-GHz clock grid was demonstrated that achieves low skew and jitter. Based on these results, we believe that standing-wave clock distribution will be an attractive and scalable alternative to H-trees for future microprocessors as clock frequency scales to 10 GHz and beyond.

## ACKNOWLEDGMENT

The authors would like to thank R. Chang, N. Talwalkar, B. Kleveland, and T. Soorapanth of Stanford University for helpful discussions, K. Soumyanath and M. Anders at Intel Corporation for support, and Taiwan Semiconductor Manufacturing Company for fabrication.

## REFERENCES

- [1] P. J. Restle *et al.*, "A clock distribution network for microprocessors," *IEEE J. Solid-State Circuits*, vol. 36, pp. 792–799, May 2001.
- [2] V. L. Chi, "Salphasic distribution of clock signals for synchronous systems," *IEEE Trans. Comput.*, vol. 43, pp. 597–602, May 1994.
- [3] M. E. Becker and T. F. Knight Jr., "Transmission line clock driver," in *Proc. IEEE Int. Conf. Computer Design*, Oct. 1999, pp. 489–490.
- [4] M. Hosoya, W. Hioe, K. Takagi, and E. Goto, "Operation of a 1-bit quantum flux parametron shift register (latch) by 4-phase 36-GHz clock," *IEEE Trans. Appl. Superconduct.*, vol. 5, pp. 2831–2834, June 1995.

- [5] J. Wood, T. C. Edwards, and S. Lipa, "Rotary traveling-wave oscillator arrays: a new clock technology," *IEEE J. Solid-State Circuits*, vol. 36, pp. 1654–1665, Nov. 2001.
- [6] I. Galton, D. A. Towne, J. J. Rosenberg, and H. T. Jensen, "Clock distribution using coupled oscillators," in *Proc. IEEE Int. Symp. Circuits and Systems*, vol. 3, May 1996, pp. 217–220.
- [7] L. Hall, M. Clements, W. Liu, and G. Bilbro, "Clock distribution using cooperative ring oscillators," in *Proc. 17th Conf. Advanced Research in VLSI*, Sept. 1997, pp. 15–16.
- [8] V. Gutnik and A. P. Chandrakasan, "Active GHz clock network using distributed PLLs," *IEEE J. Solid-State Circuits*, vol. 35, pp. 1553–1560, Nov. 2000.
- [9] M. Saint-Laurent, M. Swaminathan, and J. D. Meindl, "On the microarchitectural impact of clock distribution using multiple PLLs," in *Proc. IEEE Int. Conf. Computer Design*, Sept. 2001, pp. 214–220.
- [10] M. Bussmann and U. Langmann, "Active compensation of interconnect losses for multi-GHz clock distribution networks," *IEEE Trans. Circuits Syst. II*, vol. 39, pp. 790–798, Nov. 1992.
- [11] S. Deibele and J. B. Beyer, "Attenuation compensation in distributed amplifier design," *IEEE Trans. Microwave Theory Tech.*, vol. MTT-37, pp. 1425–1433, Sept. 1989.
- [12] R. A. York, "Nonlinear analysis of phase relationships in quasi-optical oscillator arrays," *IEEE Trans. Microwave Theory Tech.*, vol. 41, pp. 1799–1809, Oct. 1993.
- [13] A. Maxim, B. Scott, E. M. Schneider, M. L. Hagge, S. Chacko, and D. Stiurca, "A low jitter 125–1250 MHz process independent and ripple-poleless 0.18-μm CMOS PLL based on a sample-reset loop filter," *IEEE J. Solid-State Circuits*, vol. 36, pp. 1673–1683, Nov. 2001.
- [14] B. Van der Pol, "The nonlinear theory of electric oscillations," *Proc. IRE*, vol. 22, pp. 1051–1085, Sept. 1934.



C. Patrick Yue (S'93–M'99) received the B.S. degree in electrical engineering with highest honors from the University of Texas at Austin in 1992 and the M.S. and Ph.D. degrees in electrical engineering from Stanford University, Stanford, CA, in 1994 and 1998, respectively. His doctoral thesis focused on integration of spiral inductors for Si-based RF ICs.

Dr. Yue has held summer positions at Texas Instruments Incorporated and Hewlett Packard Laboratories, in 1993 and 1994, respectively. After completing his Ph.D. degree in 1998, he worked at the Center for Integrated Systems of Stanford University as a Research Associate conducting research on high-frequency interconnect design. In November 1998, he assisted in founding Atheros Communications, Sunnyvale, CA, where he was an Analog Design and Modeling Manager in charge of CMOS device modeling and circuit design for wireless local area network applications. He was a core member of the team that delivered the world's first IEEE 802.11a CMOS RF transceiver for volume production. In September 2002, he joined Aeluros Inc., Mountain View, CA, where he works on device modeling and signal integrity issues for 10-Gb/s serial links. He has been a Consulting Assistant Professor with Stanford University since 2001. He has authored or coauthored more than 25 articles and contributed to one book chapter in the area of CMOS RF device modeling and circuit design. He holds three U.S. patents and has several pending applications.



Mark A. Horowitz (S'77–M'78–SM'95–F'00) received the B.S. and M.S. degrees in electrical engineering from the Massachusetts Institute of Technology, Cambridge, in 1978, and the Ph.D. degree from Stanford University, Stanford, CA, in 1984.

He is the Yahoo Founder's Professor of Electrical Engineering and Computer Science at Stanford University. His research area is in digital system design, and he has led a number of processor designs including MIPS-X, one of the first processors to include an on-chip instruction cache, TORCH, a statically scheduled, superscalar processor that supported speculative execution, and FLASH, a flexible DSM machine. He has also worked in a number of other chip design areas, including high-speed and low-power memory design, high-bandwidth interfaces, and fast floating point. In 1990, he took leave from Stanford to help start Rambus Inc., Los Altos, CA, a company designing high-bandwidth memory interface technology. His current research includes multiprocessor design, low-power circuits, memory design, and high-speed links.

Dr. Horowitz received the Presidential Young Investigator Award and an IBM Faculty Development Award in 1985. In 1993, he received the Best Paper Award from the IEEE International Solid-State Circuits Conference.



Frank O'Mahony (S'00) received the B.S., M.S., and Ph.D. degrees in electrical engineering from Stanford University, Stanford, CA, in 1997, 2000, and 2003, respectively. His thesis focused on standing-wave clock distribution for high-performance microprocessors.

He is currently with Intel Corporation's Microprocessor Research Laboratories, Hillsboro, OR, researching high-speed signaling technologies.

Dr. O'Mahony received the Intel Foundation Ph.D. Fellowship Award while at Stanford.



S. Simon Wong (S'77–M'83–SM'91–F'99) received the B.E.E. and B.M.E. degrees from the University of Minnesota at Minneapolis in 1975 and 1976, respectively, and the M.S. and Ph.D. degrees from the University of California at Berkeley in 1978 and 1983, respectively.

From 1978 to 1980, he was with National Semiconductor Corporation designing MOS dynamic memories. From 1980 to 1985, he was with Hewlett Packard Laboratories working on advanced MOS technologies. From 1985 to 1988, he was an Assistant Professor in the School of Electrical Engineering at Cornell University, Ithaca, NY. In 1988, he joined Stanford University, Stanford, CA, where he is currently a Professor of electrical engineering. His current research concentrates on interconnect technologies, and design and modeling of high-frequency interconnect structures and devices.